Cross-framework parser stacking for data-driven dependency parsing

Authors

  • Lilja Øvrelid
  • Jonas Kuhn
  • Kathrin Spreyer
Abstract

In this article, we present and evaluate an approach to the combination of a grammar-driven and a data-driven parser which exploits machine learning for the acquisition of syntactic analyses guided by both parsers. We show how conversion of LFG output to dependency representation allows for a technique of parser stacking, whereby the output of the grammar-driven parser supplies features for a data-driven dependency parser. We evaluate on English and German and show significant improvements in overall parse results stemming from the proposed dependency structure as well as other linguistic features derived from the grammars. Finally, we perform an application-oriented evaluation and explore the use of the stacked parsers as the basis for the projection of dependency annotation to a new language.
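The stacking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' feature model: the `Token` fields `guide_head` and `guide_label`, the function `stacked_features`, and the feature names are all hypothetical, chosen only to show how a dependency analysis converted from a grammar-driven parser's output could be exposed as extra features to a data-driven parser when it scores a candidate arc.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Token:
    index: int                          # 1-based position in the sentence
    form: str
    pos: str
    # Hypothetical fields: head and label proposed by the grammar-driven
    # parser after conversion to dependencies; None if it gave no analysis.
    guide_head: Optional[int] = None    # 0 stands for the artificial root
    guide_label: Optional[str] = None


def stacked_features(tokens: List[Token], dependent: int, head: int) -> Dict[str, str]:
    """Features for a candidate arc head -> dependent (illustrative only)."""
    dep = tokens[dependent - 1]
    head_pos = tokens[head - 1].pos if head > 0 else "ROOT"

    # Baseline features a data-driven parser might use on its own.
    feats = {"dep_pos": dep.pos, "head_pos": head_pos}

    # Stacking features: what the grammar-driven parser proposed.
    if dep.guide_head is None:
        feats["guide"] = "NONE"
    else:
        feats["guide_agrees"] = str(dep.guide_head == head)
        feats["guide_label"] = dep.guide_label or "NONE"
    return feats


sentence = [
    Token(1, "She", "PRP", guide_head=2, guide_label="SUBJ"),
    Token(2, "sleeps", "VBZ", guide_head=0, guide_label="ROOT"),
]
features = stacked_features(sentence, dependent=1, head=2)
```

A learner trained on such feature dictionaries is free to trust or override the guide analysis, which is the point of stacking as opposed to simply adopting the grammar-driven parser's output.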

Similar resources

Feature Engineering in Persian Dependency Parser

A dependency parser is one of the most important fundamental tools in natural language processing: it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free word order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of a phrase-structure parser fo...

Improving data-driven dependency parsing using large-scale LFG grammars

This paper presents experiments which combine a grammar-driven and a data-driven parser. We show how the conversion of LFG output to dependency representation allows for a technique of parser stacking, whereby the output of the grammar-driven parser supplies features for a data-driven dependency parser. We evaluate on English and German and show significant improvements stemming from the propose...

Bilexical Dependencies as an Intermedium for Data-Driven and HPSG-Based Parsing

Bilexical dependencies capturing asymmetrical lexical relations between heads and dependents are viewed as a practical representation of syntax that is well-suited for computation and intelligible for human readers. In the present work we use dependency representations as a bridge between data-driven and grammar-based parsing, both for cross-framework parser comparison and for parser integratio...

Analyzing and Integrating Dependency Parsers

There has been a rapid increase in the volume of research on data-driven dependency parsers in the past five years. This increase has been driven by the availability of treebanks in a wide variety of languages—due in large part to the CoNLL shared tasks—as well as the straightforward mechanisms by which dependency theories of syntax can encode complex phenomena in free word order languages. In ...

The Effect of Morphology on Persian Dependency Parsing

Data-driven systems can be adapted easily to different languages and domains. Applying this trend to dependency parsing led to the introduction of data-driven approaches. The only prerequisite for these approaches is the existence of appropriate corpora containing sentences and their associated dependency trees. Despite obtaining highly accurate results for the dependency parsing task in the English langu...

Journal title:
  • TAL

Volume 50

Publication date 2009